1. The Joint Cumulative Distribution Function (JCDF)
The foundation of multi-variable analysis is the Joint Cumulative Distribution Function $F(a_1, a_2, \dots, a_n)$. It gives the probability that multiple conditions are met at the same time.
$F(a_1, a_2, \dots, a_n) = P\{X_1 \le a_1, X_2 \le a_2, \dots, X_n \le a_n\}$
This formula represents the probability that each variable $X_i$ falls below its respective threshold $a_i$ simultaneously. Geometrically, in two dimensions, this is the probability that the random pair $(X, Y)$ falls within the semi-infinite rectangle to the lower-left of the point $(a, b)$.
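As a sketch of this definition, the JCDF can be estimated by Monte Carlo: sample many pairs and count the fraction landing in the lower-left rectangle. The distributions here (two independent Uniform(0, 1) variables, for which $F(a, b) = ab$ exactly) are chosen for illustration and are not from the text.

```python
import random

random.seed(0)

# Illustrative choice: X, Y independent Uniform(0, 1), so the exact
# joint CDF is F(a, b) = P{X <= a, Y <= b} = a * b.
a, b = 0.6, 0.7
n = 200_000

# Count the fraction of sampled pairs that fall in the
# semi-infinite rectangle (-inf, a] x (-inf, b].
hits = sum(1 for _ in range(n)
           if random.random() <= a and random.random() <= b)
estimate = hits / n

print(f"Monte Carlo F({a}, {b}) ~ {estimate:.3f}")  # exact value: 0.42
```

With 200,000 samples the estimate typically lands within about 0.01 of the exact value $0.6 \times 0.7 = 0.42$.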
2. The Infinitesimal Interpretation of Density
For continuous variables, we describe probability through a Joint Probability Density Function (JPDF), $f(x, y)$. Unlike discrete cases, the probability at a single point is zero. Instead, we look at infinitesimal regions:
- The probability that a pair $(X, Y)$ falls within a tiny rectangle is given by:
$P\{a < X < a + da, b < Y < b + db\} = \int_{b}^{b+db} \int_{a}^{a+da} f(x, y) \, dx \, dy \approx f(a, b) \, da \, db$
- Alternatively expressed as: $P\{x < X < x + dx, y < Y < y + dy\} \approx f(x, y) \, dx \, dy$
This reveals that $f(x, y)$ is a "density" relative to the area of the region in the Cartesian plane.
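The approximation $P \approx f(a, b)\,da\,db$ can be checked numerically against a density with a closed-form CDF. The density used here, $f(x, y) = e^{-x-y}$ for $x, y \ge 0$ (two independent Exponential(1) variables), is an illustrative assumption, not an example from the text.

```python
import math

# Illustrative density: f(x, y) = e^(-x - y) for x, y >= 0.
def f(x, y):
    return math.exp(-x - y)

a, b = 1.0, 0.5
da = db = 1e-3

# Exact probability of the tiny rectangle, integrating the density:
# P = (e^-a - e^-(a+da)) * (e^-b - e^-(b+db))
exact = (math.exp(-a) - math.exp(-(a + da))) * \
        (math.exp(-b) - math.exp(-(b + db)))

# Infinitesimal approximation: f(a, b) * da * db
approx = f(a, b) * da * db

print(exact, approx)
```

The two values agree to roughly one part in a thousand here, and the relative error shrinks further as $da, db \to 0$, which is exactly the sense in which $f$ is a density with respect to area.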
3. Dependency and Geometric Constraints
Random variables that are not independent are said to be dependent. This is not just an algebraic property; it is often visible in the support of the distribution.
Consider a point $(X, Y)$ chosen uniformly within a circle of radius $R$ centered at $(0,0)$. The variables $X$ and $Y$ are dependent because knowing $X = x$ limits the possible values of $Y$.
If $X$ is near $R$, $Y$ must be near $0$. Mathematically, $Y$ is constrained: $-\sqrt{R^2 - X^2} \le Y \le \sqrt{R^2 - X^2}$. This boundary is what prevents the joint density from being factored into independent marginals.
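The geometric constraint above can be seen empirically. The sketch below draws uniform points from the disk by rejection sampling (an assumed method, not specified in the text) and compares the spread of $Y$ when $|X|$ is small versus when $|X|$ is near $R$; under independence the spread would not change with $X$.

```python
import random

random.seed(1)
R = 1.0

# Rejection sampling: draw from the bounding square until the
# point lands inside the disk x^2 + y^2 <= R^2.
def sample_disk():
    while True:
        x = random.uniform(-R, R)
        y = random.uniform(-R, R)
        if x * x + y * y <= R * R:
            return x, y

pts = [sample_disk() for _ in range(100_000)]

# Largest |Y| observed when |X| is small vs. when |X| is near R.
near_0 = max(abs(y) for x, y in pts if abs(x) < 0.1)
near_R = max(abs(y) for x, y in pts if abs(x) > 0.9)

print(near_0, near_R)
```

When $|X| > 0.9$, the constraint forces $|Y| \le \sqrt{1 - 0.81} \approx 0.436$, while near $X = 0$ the observed $|Y|$ values reach almost all the way to $R$: knowing $X$ changes what $Y$ can be, so the joint density cannot factor into independent marginals.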